Xssist Blog

[Perl] Opening text files for reading, and simple regexp (regular expressions)

Perl is easy to use, and pretty good for working on text files.

Suppose you have a text file named records.txt with the following contents:

Name,Class,Subject,Score
Alvin,1A1,Maths,80
Alvin,1A1,Science,92
Alvin,1A1,English,85
Roy,1A1,Maths,80
Roy,1A1,Science,92
Roy,1A1,English,92
Susan,1A2,Maths,85
Susan,1A2,Science,85
Susan,1A2,English,85

The following example opens the file for reading and prints out every line in it.

----- begin example -------
#!/usr/bin/perl
open FILE, "records.txt";

# read the file one line at a time and print each line
while ($line=<FILE>){
    print $line;
}
close FILE;
----- end example -------

FILE is just a filehandle name of your choosing; instead of FILE, you can use FILEIN, MYFILE, or STUDENT etc.
What if you want the filename to be specified at runtime, instead of hardcoded in the script?
Use shift, which takes the first command-line argument, as shown in the following example:

----- begin example -------
#!/usr/bin/perl

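# take the first command-line argument as the filename
$filename=shift;
# '<' opens the file for reading only, whatever characters the filename contains
open STUDENT, '<', $filename or die "error opening $filename\n";

while ($line=<STUDENT>){
    print $line;
}
close STUDENT;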
----- end example -------

If the script is named example.pl, it can be executed as follows: ./example.pl records.txt

In the example above, I also sneaked in the use of "die": if the open fails, it prints out an error message and ends the script.

The use of "or" is similar to the following:
if (! (open STUDENT, '<', $filename) ){
    die "error opening $filename\n";
}

i.e. if whatever happens on the left of "or" does not return a true value, evaluate whatever is on the right hand side of "or". open returns a true value on success and a false value on failure.

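In case you are not familiar with \n, it is a newline (linefeed). Ending the die message with \n also stops Perl from appending the script name and line number to the message.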
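To see the behaviour of "or" in isolation, here is a minimal sketch. The helper name try_open and the file name no_such_file.txt are made up for illustration, and the sketch assumes no_such_file.txt does not exist in the current directory. Unlike the examples above, it uses "use strict", a lexical filehandle ($fh) and a temporary file from the core File::Temp module, so that one open is sure to succeed.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# try_open is a hypothetical helper written for this illustration.
# It returns "opened" if the file can be opened for reading,
# and "failed" otherwise.
sub try_open {
    my ($filename) = @_;
    my $result = "opened";
    # The right hand side of "or" only runs when open returns a false value.
    open(my $fh, '<', $filename) or $result = "failed";
    close($fh) if $result eq "opened";
    return $result;
}

# Create an empty temporary file so that one open is sure to succeed.
# UNLINK => 1 removes the file again when the script exits.
my ($tmp_fh, $tmp_name) = tempfile(UNLINK => 1);
close($tmp_fh);

print "existing file: ", try_open($tmp_name), "\n";
print "missing file: ", try_open("no_such_file.txt"), "\n";   # assumed not to exist
```

Under those assumptions, the first line printed should end in "opened" and the second in "failed". Replacing the assignment on the right of "or" with die gives the behaviour used in example.pl.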

regexp
-----
To search for a string, e.g. "Science", you can read in each line and use pattern matching. The following code opens the file records.txt, reads in each line, checks for "Science", and prints out the line if it matches.

----- begin example -------
#!/usr/bin/perl
open FILE, "records.txt" or die "error opening records.txt\n";

while ($line=<FILE>){
    # print only the lines that contain "Science"
    if ($line=~/Science/){
        print $line;
    }
}
close FILE;
----- end example -------

How about printing out each line, split into Name, Class, Subject and Score?

We know the format of the file: each line contains name, class, subject and score, delimited by ",". A regexp of ^(.*?),(.*?),(.*?),(.*?)$ will allow us to extract 4 values from the line.

^ matches the beginning of the line
(.*?), matches zero or more characters, as few as possible, up to the next ",", and captures the matched string into a variable ($1 for the first pair of brackets, $2 for the second, and so on)
$ matches the end of the line (just before the trailing newline, if there is one)

----- begin example -------
#!/usr/bin/perl
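open FILE, "records.txt" or die "error opening records.txt\n";

while ($line=<FILE>){
    # $1 to $4 hold the text captured by the four pairs of brackets
    if ($line=~/^(.*?),(.*?),(.*?),(.*?)$/){
        print "Name:$1 Class:$2 Subject:$3 Score:$4\n";
    }
}
close FILE;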
----- end example -------
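
The following sketch runs the same regexp against a few in-memory lines instead of a file, to show what ends up in $1 to $4. The sub name parse_record, the line with an empty Class field and the line without commas are all made up for illustration; they are not part of records.txt.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# parse_record is a hypothetical helper written for this illustration.
# It returns the four captured fields, or an empty list if the
# line does not match the regexp.
sub parse_record {
    my ($line) = @_;
    if ($line =~ /^(.*?),(.*?),(.*?),(.*?)$/) {
        return ($1, $2, $3, $4);
    }
    return;
}

my @lines = (
    "Alvin,1A1,Maths,80\n",
    "Susan,,English,85\n",   # made-up line with an empty Class field
    "no commas here\n",      # does not match, so it is skipped
);

foreach my $line (@lines) {
    my @fields = parse_record($line);
    next unless @fields;     # skip lines that did not match
    print "Name:$fields[0] Class:$fields[1] Subject:$fields[2] Score:$fields[3]\n";
}
```

The second line still matches, with an empty string captured for Class, because .*? is allowed to match nothing at all. The score comes out without its trailing newline, because $ matches just before it.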

Xssist
29 Jun 2010
